Reinforcement Learning In Deepseek-R1 Visually Explained